Deliverable #2 - ontology items in, matrix out

Extending this idea, it should work as follows: identify regions in ontology, pass ontology ids of inputs & outputs to webserver, get back either a) matrix of single voxel values or b) visualization of expression heatmap

Let's make this easier:

You give me a list of acronyms from the AIBS ontology (http://atlas.brain-map.org/)

For inputs

Query API for list of injections in the anatomic region
Download list of expression grid data for injection set

For outputs

Convert the acronym to ontology id
Extract the corresponding branch of the ontology
Flatten that branch into a list of ontology ids
Mask the annotation volume to generate a set of coordinates

Extract expression values from output coordinates and place into matrix

Bonus points: organize by ontology order and generate colored sidebars as seen in Figure 3.


In [1]:
import zipfile, io
import matplotlib
from matplotlib import cm
import numpy.ma as ma
import json
import requests

In [2]:
### Example regions
# LAT -> MOs
input_names = ['LAT']
output_names = ['MOs']

In [5]:
# Query via structure search

structure_search = 'http://api.brain-map.org/api/v2/data/query.json?criteria=service::mouse_connectivity_injection_structure[injection_structures$eq%s][primary_structure_only$eqtrue]'
s_query = structure_search % (input_names[0])

experiment_list = requests.get(s_query).json()['msg']
print 'Found %d experiments with %s as the injection site' % (len(experiment_list), input_names[0])


Found 22 experiments with LAT as the injection site

In [64]:
# Here's what a single experiment looks like
experiment_list[0]


Out[64]:
{u'gender': u'M',
 u'id': 267999740,
 u'injection-coordinates': [7600, 3600, 6900],
 u'injection-structures': [{u'abbreviation': u'APN',
   u'color': u'FF90FF',
   u'id': 215,
   u'name': u'Anterior pretectal nucleus'},
  {u'abbreviation': u'LP',
   u'color': u'FF909F',
   u'id': 218,
   u'name': u'Lateral posterior nucleus of the thalamus'},
  {u'abbreviation': u'DG',
   u'color': u'7ED04B',
   u'id': 726,
   u'name': u'Dentate gyrus'},
  {u'abbreviation': u'PO',
   u'color': u'FF909F',
   u'id': 1020,
   u'name': u'Posterior complex of the thalamus'}],
 u'injection-volume': u'0.237307',
 u'name': u'Grik4-Cre-5129',
 u'num-voxels': 489053,
 u'strain': u'B6.Cg',
 u'structure-abbrev': u'PO',
 u'structure-color': u'FF909F',
 u'structure-id': 1020,
 u'structure-name': u'Posterior complex of the thalamus',
 u'sum': u'7.40048',
 u'transgenic-line': u'Grik4-Cre'}

In [67]:
#Now lets grab all of the expression volumes 

all_expression = []

for experiment in experiment_list[0:4]:
    print experiment['id']
    experiment_summary = {}
    
    # Download the corresponding expression grid
    example_api_query = 'http://api.brain-map.org/grid_data/download/%d?include=intensity,density' % (experiment['id'])
    r = requests.get(example_api_query)
    
    memfile = io.BytesIO(r.content)
    zfile = zipfile.ZipFile(memfile, 'r')
    for name in zfile.namelist():
        if name == 'density.raw':
            dt = np.dtype('<f') # little endian, float32
            rawArray = np.fromstring(zfile.read(name), dtype=dt)
            expressionArray = rawArray.reshape(115,81,133)   # as opposed to 67, 41, 58?    
            #imshow(expressionArray[:,:,20].T, cmap = cm.jet) # show coronal plane, rotated (transpose)
            
            experiment_summary = experiment
            experiment_summary['expression'] = expressionArray
            
            all_expression.append(experiment_summary)
            
#     if name =='density.mhd':
#         pass
#         print(zfile.read(name))


267999740
182805258
146658879
113846682

In [72]:
# visualizing the first of many expression volumes
imshow(all_expression[0]['expression'][:,:,20].T, cmap = cm.jet) # show coronal plane, rotated (transpose)


Out[72]:
<matplotlib.image.AxesImage at 0x10a860ed0>

Generating the output mask

We only care about voxels in the output anatomic region. This can't be done via API.

First, convert the acronym from the output region into its ontology id


In [17]:
# extract the branch of the ontology graph (json)
def extractOntology(jsn, target):
    if jsn['acronym'] == target:
        return jsn
    else:
        jsn_to_return = None
        for child in jsn['children']:
            ret = extractOntology(child, target)
            if ret:
                return ret
    return None    

# flattens the ontology into a list
def parseOntology(jsn, structures={}, gOrder=0):
    i = jsn['id']
    n = jsn["name"]
    p = jsn["parent_structure_id"]
    c = jsn['color_hex_triplet']
    a = jsn['acronym']

    struct = {"id":i, "name":n, "order":gOrder, "parent":p, "color": c, 'acronym': a}
    structures[i] = struct
    for child in jsn["children"]:
        parseOntology(child, structures, gOrder = gOrder + 1)

In [18]:
# Downloading the complete ontology via API
full_ontology_url = 'http://api.brain-map.org/api/v2/structure_graph_download/1.json'
full_ontology = requests.get(full_ontology_url).json()['msg'][0]

Extract the corresponding branch of the ontology, flatten it


In [75]:
# do the same thing for the output regions
output_ontology = extractOntology(full_ontology, output_names[0])

# flatten it
output_list = {}
parseOntology(output_ontology, output_list)

# the keys are the respective ontology ids for each anatomic region
output_ontology_ids = output_list.keys()

In [76]:
# and going to ontology regions:
print 'Outputs:', output_ontology_ids


Outputs: [993, 962, 1021, 656, 1085, 767]

Get injection coordinates for each ontology id


In [54]:
# grab the annotation volume from AIBS and unpack it. 
# We're using a cached version to minimize downloading the same thing
atlasVolume = fromfile("gridAnnotation_100micron/gridAnnotation.raw", dtype=uint32).reshape(115,81,133)  # (133, 81, 115)
imshow(atlasVolume[:,:,20].T, cmap = cm.jet) # show coronal plane, rotated (transpose)


Out[54]:
<matplotlib.image.AxesImage at 0x109f6b1d0>

In [77]:
# easy to figure out each ontology id at any index
atlasVolume[70, 40, 20]


Out[77]:
900

In [78]:
# extract output coordinates 
output_coordinates = []

for out_id in output_ontology_ids:
    v = numpy.where(atlasVolume == out_id)
    for c in xrange(len(v[0])):    
        coord = (v[0][c], v[1][c], v[2][c])
        output_coordinates.append([out_id, coord])
        
# just to demonstrate what we have now
print 'Example output coord:', output_coordinates[0]


Example output coord: [962, (31, 28, 28)]

In [74]:
# mask the density volume with output coordinates and write to a file

masked_array = numpy.zeros_like(atlasVolume)

csv_out = open('output.csv', 'w')

# column headers
csv_out.write('injection_onto_id, flatten_idx, x,y,z')

# experiment ids (we could use any thing from the experiment metadata here instead)
for exp in all_expression:
    csv_out.write(', %d' % exp['id'])

csv_out.write('\n')

for cc in output_coordinates:
    c = cc[1]
    flattened_c = c[2] * (115 * 81) + c[1] * 115 + c[0]
    
    csv_out.write('%d, %d, %d, %d, %d' % (cc[0], flattened_c, c[0], c[1], c[2]))
    for exp in all_expression:
        expressionArray = exp['expression']
        csv_out.write(',%f' % expressionArray[c[0]][c[1]][c[2]])
    
    csv_out.write('\n')              
                  
csv_out.close()

Unused code from earlier attempts


In [79]:
# A nice encapsulation of the annotation library

class AtlasDereference(object):
    def __init__(self):
        self.atlasVolume = fromfile("gridAnnotation_100micron/gridAnnotation.raw", dtype=uint32).reshape((115, 81, 133)).T # (133, 81, 115)

        self.ont = {}
        jsn = json.load(open("gridAnnotation_100micron/1.json"))
        parseOntology(jsn["msg"][0], self.ont)
        
        deepestNode = -1
        for struct in self.ont.values():
            if struct["order"] > deepestNode:
                deepestNode = struct["order"]

        # print deepestNodecolor_hex_triplet

    def idAtPoint(self, point):
        indx = around(point).astype(int)
        return self.atlasVolume[tuple(indx)]

    def nameAtPoint(self, point, level):        
        return self.infoAtPoint(point, level)["name"]

    def infoAtPoint(self, point, level):
        id = self.idAtPoint(point)
        info = self.ont[id]
        while info["order"] > level:
            info = self.ont[info["parent"]]
        return info

    def colorAtPoint(self, point, level):
        return self.infoAtPoint(point, level)["color"]

    def ontologyForAcronym(self, a):
        for k,v in self.ont.iteritems():
            if a == v['acronym']:
                return v
        return None
    

# helpful to convert colors to html-happy representation
HEX = '0123456789abcdef'
HEX2 = dict((a+b, HEX.index(a)*16 + HEX.index(b)) for a in HEX for b in HEX)

def rgb(triplet):
    triplet = triplet.lower()
    return (HEX2[triplet[0:2]]/255.0, HEX2[triplet[2:4]]/255.0, HEX2[triplet[4:6]]/255.0)

In [81]:
# instantiate
ad = AtlasDereference()

In [82]:
# ontology id at a voxel coordinate
ad.idAtPoint((40,40,40))


Out[82]:
672

In [83]:
# some metadata for the anatomic region at a voxel coordinate
ad.infoAtPoint((40,40,40), 5)


Out[83]:
{'acronym': u'STRd',
 'color': u'98D6F9',
 'id': 485,
 'name': u'Striatum dorsal region',
 'order': 5,
 'parent': 477}

In [84]:
# just the color
ad.colorAtPoint((40,40,40), 5)


Out[84]:
u'98D6F9'

In [86]:
# if i give you an acronym, give me back the ontology branch and leaves

acronym_test = 'HPF'
ac_get = ad.ontologyForAcronym(acronym_test)
ac_get


Out[86]:
{'acronym': u'HPF',
 'color': u'7ED04B',
 'id': 1089,
 'name': u'Hippocampal formation',
 'order': 5,
 'parent': 695}

In [87]:
# load the complete ontology to work with locally
jsn = json.load(open("gridAnnotation_100micron/1.json"))["msg"][0]

In [88]:
# extract a branch, then flatten it
hpf_ontology = extractOntology(jsn, acronym_test)

hpf_flat = {}

parseOntology(hpf_ontology, hpf_flat)

In [93]:
# keys correspond to all ontology ids within the anatomic region defined by the acronym  above

print acronym_test

hpf_flat.keys()

# i think these retain the original ordering as well (haven't checked)


HPF
Out[93]:
[518,
 1037,
 526,
 19,
 20,
 28,
 543,
 550,
 52,
 1080,
 1084,
 1089,
 92,
 1121,
 1133,
 632,
 139,
 664,
 454,
 712,
 715,
 726,
 727,
 10703,
 734,
 742,
 743,
 751,
 758,
 764,
 766,
 259,
 775,
 782,
 790,
 799,
 807,
 815,
 822,
 823,
 312,
 829,
 324,
 837,
 843,
 845,
 853,
 861,
 870,
 60,
 371,
 375,
 382,
 387,
 391,
 909,
 399,
 918,
 407,
 926,
 415,
 419,
 934,
 423,
 431,
 438,
 446,
 10693,
 10694,
 10695,
 10696,
 10697,
 10698,
 10699,
 10700,
 10701,
 10702,
 463,
 10704,
 468,
 982,
 471,
 479,
 486,
 999,
 495,
 502,
 504,
 508,
 509]

In [ ]: